
    Network analyses of proteome evolution and diversity

    The mapping of biomolecular interactions reveals that the function of most biological components depends on a web of interrelations with other cellular components, stressing the need for a systems-level view of biological functions. In this work, I explore ways in which the integration of network and genomic information from different organizational levels can lead to a better understanding of cellular systems and components. First, studying yeast, I show that the evolutionary properties of target genes constitute the dominant determinant of transcription factor (TF) evolutionary rate and that this evolutionary modularity is limited to activating regulatory relationships. I also show that targets of fast-evolving TFs show greater evolutionary expression changes and are enriched for niche-specific functions and other TFs. This work highlights the importance of trans-regulatory network evolution in species-specific gene expression and network adaptation. Next, I show that genes either lost or gained across fungal evolution are enriched in TFs and have very different network and genomic properties than universally conserved genes, including, in sharp contrast to other networks, a greater number of transcriptional regulators. Placing genes in the context of their evolutionary life-cycle reveals principles of network integration of gained genes and evidence for the progressive network and functional marginalization of genes as an evolutionary process preceding gene loss. In the final chapter, I study how alternative splicing (AS)-driven expansion of human proteome diversity leads to system-level complexity through the AS-mediated rewiring of the protein-protein interaction network. By overlaying different network and genomic datasets onto the first large-scale isoform-resolution interactome, I found that differentiating between splice variants is essential to capturing the full extent of the network's functional modularity. 
I also discovered that AS-mediated rewiring preferentially affects tissue-specific genes and that topologically different patterns of rewiring have distinct functional consequences. Furthermore, I found that most rewiring can be traced to the AS of evolutionarily conserved sequence modules, which promote or block interactions and tend to overlap linear motifs and disrupt known domain-domain interactions. Together, this work demonstrates that a network-level perspective and genomic data integration are essential to understanding the evolution and functional diversity of proteomes.

    Comparison of Affymetrix Gene Array with the Exon Array shows potential application for detection of transcript isoform variation

    Background: The emergence of isoform-sensitive microarrays has helped fuel in-depth studies of the human transcriptome. The Affymetrix GeneChip Human Exon 1.0 ST Array (Exon Array) has been previously shown to be effective in profiling gene expression at the isoform level. More recently, the Affymetrix GeneChip Human Gene 1.0 ST Array (Gene Array) has been released for measuring gene expression and interestingly contains a large subset of probes from the Exon Array. Here, we explore the potential of using Gene Array probes to assess expression variation at the sub-transcript level. Utilizing datasets of the high quality Microarray Quality Control (MAQC) RNA samples previously assayed on the Exon Array and Gene Array, we compare the expression measurements of the two platforms to determine the performance of the Gene Array in detecting isoform variations.

    Results: Overall, we show that the Gene Array is comparable to the Exon Array in making gene expression calls. Moreover, to examine expression of different isoforms, we modify the Gene Array probe set definition file to enable summarization of probe intensity values at the exon level and show that the expression profiles between the two platforms are also highly correlated. Next, expression calls of previously known differentially spliced genes were compared and also show concordant results. Splicing index analysis, representing estimates of exon inclusion levels, shows a lower but good correlation between platforms. As the Gene Array contains a significant subset of probes from the Exon Array, we note that, in comparison, the Gene Array overlaps with fewer but still a high proportion of splicing events annotated in the Known Alt Events UCSC track, with abundant coverage of cassette exons. We discuss the ability of the Gene Array to detect alternative splicing and isoform variation and address its limitations.

    Conclusion: The Gene Array is an effective expression profiling tool at gene and exon expression level, the latter made possible by probe set annotation modifications. We demonstrate that the Gene Array is capable of detecting alternative splicing and isoform variation. As expected, in comparison to the Exon Array, it is limited by reduced gene content coverage and is not able to detect as wide a range of alternative splicing events. However, for the events that can be monitored by both platforms, we estimate that the selectivity and sensitivity levels are comparable. We hope our findings will shed light on the potential extension of the Gene Array to detect alternative splicing. It should be particularly suitable for researchers primarily interested in gene expression analysis, but who may be willing to look for splicing and isoform differences within their dataset. However, we do not suggest it to be an equivalent substitute to the more comprehensive Exon Array.
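    The splicing index mentioned in this abstract is commonly computed as the log2 ratio of gene-normalized exon intensities between two samples. The sketch below illustrates that common definition only; the abstract does not state the exact formula or software the authors used, and the function name `splicing_index` and its parameters are hypothetical.

```python
import math


def splicing_index(exon_a, gene_a, exon_b, gene_b):
    """Illustrative splicing index (assumed common definition, not
    necessarily the authors' exact method).

    Each exon-level intensity is first normalized by the gene-level
    intensity of the same sample; the index is the log2 ratio of the
    two normalized values (sample A over sample B).
    """
    values = (exon_a, gene_a, exon_b, gene_b)
    if any(v <= 0 for v in values):
        raise ValueError("intensities must be positive (linear scale)")
    normalized_a = exon_a / gene_a  # exon signal relative to its gene, sample A
    normalized_b = exon_b / gene_b  # exon signal relative to its gene, sample B
    return math.log2(normalized_a / normalized_b)
```

    Under this definition, a negative value suggests lower relative inclusion of the exon in sample A than in sample B, and a value near zero suggests no difference once overall gene expression is accounted for.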

    RNA-Seq identifies SPGs as a ventral skeletal patterning cue in sea urchins

    The sea urchin larval skeleton offers a simple model for formation of developmental patterns. The calcium carbonate skeleton is secreted by primary mesenchyme cells (PMCs) in response to largely unknown patterning cues expressed by the ectoderm. To discover novel ectodermal cues, we performed an unbiased RNA-Seq-based screen and functionally tested candidates; we thereby identified several novel skeletal patterning cues. Among these, we show that SLC26a2/7 is a ventrally expressed sulfate transporter that promotes a ventral accumulation of sulfated proteoglycans, which is required for ventral PMC positioning and skeletal patterning. We show that the effects of SLC perturbation are mimicked by manipulation of either external sulfate levels or proteoglycan sulfation. These results identify novel skeletal patterning genes and demonstrate that ventral proteoglycan sulfation serves as a positional cue for sea urchin skeletal patterning.

    Fine-Scale Variation and Genetic Determinants of Alternative Splicing across Individuals

    Recently, thanks to the increasing throughput of new technologies, we have begun to explore the full extent of alternative pre-mRNA splicing (AS) in the human transcriptome. This is unveiling a vast layer of complexity in isoform-level expression differences between individuals. We used previously published splicing-sensitive microarray data from lymphoblastoid cell lines to conduct an in-depth analysis of the splicing efficiency of known and predicted exons. By combining publicly available AS annotation with a novel algorithm designed to search for AS, we show that many real AS events can be detected within the usually unexploited, speculative majority of the array and at significance levels much below standard multiple-testing thresholds, demonstrating that the extent of cis-regulated differential splicing between individuals is potentially far greater than previously reported. Specifically, many genes show subtle but significant genetically controlled differences in splice-site usage. PCR validation shows that 42 out of 58 (72%) candidate gene regions undergo detectable AS, amounting to the largest scale validation of isoform eQTLs to date. Targeted sequencing revealed a likely causative SNP in most validated cases. In all 17 instances where a SNP affected a splice-site region, in silico splice-site strength modeling correctly predicted the direction of the microarray and PCR results. In 13 other cases, we identified likely causative SNPs disrupting predicted splicing enhancers. Using Fst and REHH analysis, we uncovered significant evidence that 2 putative causative SNPs have undergone recent positive selection. We verified the effect of five SNPs using in vivo minigene assays. This study shows that splicing differences between individuals, including quantitative differences in isoform ratios, are frequent in human populations and that causative SNPs can be identified using in silico predictions. Several cases affected disease-relevant genes, and it is likely that some of these differences are involved in phenotypic diversity and susceptibility to complex diseases.
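    The Fst statistic named in this abstract measures how strongly allele frequencies differ between populations. The sketch below shows one textbook form for a single biallelic SNP and two equally weighted populations; the abstract does not say which Fst estimator the authors used, so the formula choice, the function name `fst_biallelic`, and its parameters are assumptions for illustration.

```python
def fst_biallelic(p1, p2):
    """Illustrative Fst for one biallelic SNP in two equally weighted
    populations (assumed textbook form, not necessarily the authors'
    estimator).

    p1, p2: frequency of the same allele in population 1 and 2.
    Fst = (Ht - Hs) / Ht, where Hs is the mean within-population
    expected heterozygosity and Ht is the expected heterozygosity of
    the pooled population.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    p_bar = (p1 + p2) / 2                             # pooled allele frequency
    ht = 2 * p_bar * (1 - p_bar)                      # total heterozygosity
    if ht == 0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    return (ht - hs) / ht
```

    Values near 0 indicate similar allele frequencies in both populations, while values approaching 1 indicate that the populations are nearly fixed for different alleles, which is the kind of pattern a scan for recent positive selection looks for.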

    Intron loss and gain in Eukaryotes

    Although introns were first discovered almost 30 years ago, their evolutionary origin and function remain elusive. In this thesis, I describe a reference-based intron mapping method based on multi-species whole-genome alignments. We applied this method in two distinct studies. First, we studied intron loss and gain dynamics in mammals and subsequently in Drosophila. We mapped known human introns onto the mouse, rat and dog genomes, mouse introns onto the human genome and Drosophila melanogaster introns onto 10 other fully sequenced Drosophila genomes. This genome-wide approach allowed us to assess the presence or absence of over 150,000 known human introns across four mammalian species and more than 35,000 D. melanogaster introns across 11 fruit fly species. We inferred 122 intron loss events in mammals and no intron gain events. In flies, we were able to identify 1754 intron loss events and 213 gain events. In both studies we found that lost introns tend to be extremely short and show higher than average similarity between their 5' splice-site sequence and the 3' partner splice-site sequence. We also demonstrate that losses in mammals occur preferentially in highly expressed housekeeping genes, while in Drosophila we show that lost and gained introns are flanked by longer than average exons, display quite distinct phase distributions and that losses demonstrate significant clustering within genes. Across flies, it appears that introns that have been lost evolve faster than other introns, while they occur in slowly evolving genes. Our results in both studies strongly support the cDNA recombination mechanism of intron loss. 
The results in flies also suggest that selective pressures affect site-specific loss rates and show that intron gain has occurred within the Drosophila lineage, solidifying the “introns-middle” hypothesis and providing some hints about the gain mechanism and origin of introns.

    Regulatory Network Structure as a Dominant Determinant of Transcription Factor Evolutionary Rate

    The evolution of transcriptional regulatory networks has thus far mostly been studied at the level of cis-regulatory elements. To gain a complete understanding of regulatory network evolution we must also study the evolutionary role of trans-factors, such as transcription factors (TFs). Here, we systematically assess genomic and network-level determinants of TF evolutionary rate in yeast, and how they compare to those of generic proteins, while carefully controlling for differences of the TF protein set, such as expression level. We found significantly distinct trends relating TF evolutionary rate to mRNA expression level, codon adaptation index, the evolutionary rate of physical interaction partners, and, confirming previous reports, to protein-protein interaction degree and regulatory in-degree. We discovered that for TFs, the dominant determinants of evolutionary rate lie in the structure of the regulatory network, such as the median evolutionary rate of target genes and the fraction of species-specific target genes. Decomposing the regulatory network by edge sign, we found that this modular evolution of TFs and their targets is limited to activating regulatory relationships. We show that fast evolving TFs tend to regulate other TFs and niche-specific processes and that their targets show larger evolutionary expression changes than targets of other TFs. We also show that the positive trend relating TF regulatory in-degree and evolutionary rate is likely related to the species-specificity of the transcriptional regulation modules. Finally, we discuss likely causes for TFs' different evolutionary relationship to the physical interaction network, such as the prevalence of transient interactions in the TF subnetwork. This work suggests that positive and negative regulatory networks follow very different evolutionary rules, and that transcription factor evolution is best understood at a network- or systems-level.

    De skal være chikt

    Gene gain and loss shape both proteomes and the networks they form. The increasing availability of closely related sequenced genomes and of genome-wide network data should enable a better understanding of the evolutionary forces driving gene gain, gene loss and evolutionary network rewiring. Using orthology mappings across 23 ascomycete fungi genomes, we identified proteins that were lost, gained or universally conserved across the tree, enabling us to compare genes across all stages of their life-cycle. Based on a collection of genome-wide network and gene expression datasets from baker’s yeast, as well as a few from fission yeast, we found that gene loss is more strongly associated with network and expression features of closely related species than with those of distant species, consistent with the evolutionary modulation of gene loss propensity through network rewiring. We also discovered that lost and gained genes, as compared to universally conserved “core” genes, have more regulators, more complex expression patterns and are much more likely to encode transcription factors. Finally, we found that the relative rate of network integration of new genes into the different types of networks agrees with experimentally measured rates of network rewiring. This systems-level view of the life-cycle of eukaryotic genes suggests that the gain and loss of genes is tightly coupled to the gain and loss of network interactions, that lineage-specific adaptations drive regulatory complexity and that the relative rates of integration of new genes are consistent with network rewiring rates.

    Inferred gene loss and gain events displayed along the yeast phylogenetic tree and how we inferred the life stage of different genes based on the phylogenetic location of their loss or gain events.

    The “+” sign denotes gains and the “-” sign, losses. Recent gains and local losses were defined as those having occurred after the split with K. waltii, with the exception of genes gained in S. cerevisiae.